ABSTRACT

The aim of this paper is to present the results of a CNES
research project on distributed computing systems. The
purpose of this research was to study the impact of new
computer technologies on the design and development of
future space applications.

The first part of this study was a state-of-the-art review
of distributed computing systems. One of the interesting
ideas arising from this review is the concept of a "virtual
computer", which allows the distributed hardware
architecture to be hidden from a software application.

The "virtual computer" can improve system performance by
allowing the architecture to be adapted to the software
application (for example by adding computers) without having
to modify its source code. This concept can also reduce the
cost of the hardware architecture and limit its obsolescence.

In order to verify the feasibility of the "virtual computer"
concept, a prototype representative of a distributed space
application is being developed independently of the hardware
architecture.

Key Words: Distributed Computing, Distributed Architecture,
Control Center.

1. OVERALL APPROACH

The motivation behind this research is the growing
importance of distributed computing environments. First of
all, a state-of-the-art review of distributed systems was
made so as to reveal the main underlying concepts. We then
determined what innovative contributions such distribution
concepts could make to the design and development of our
space computing applications. Once these contributions were
clearly identified, it was decided to validate them by
applying them to the development of a prototype
representative of a distributed space application.

2. STATE-OF-THE-ART REVIEW

2.1 Concepts

2.1.1 Definition 

There are many definitions of distributed systems. We 
chose the following one: "A distributed system is a 
system whose behaviour is determined by algorithms 
specifically designed to take into account and use sev- 
eral processing places". 

2.1.2 Client/server Model 

The client/server model is certainly the most widespread
concept in the literature dealing with distributed
architectures. The server defines the services it makes
available to the client. The client can access the server
only through the set of services (functions) the server has
decided to export. The server can serve several clients.
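
The model can be illustrated with a minimal sketch. It runs
in a single process and is not taken from any of the systems
reviewed; the class names, the service names ("put", "get")
and the stored values are hypothetical and chosen only to
show a server that exports a fixed set of services to several
clients.

```python
class Server:
    """Holds private state and exports only a chosen set of services."""

    def __init__(self):
        self._store = {}  # private state, never handed to clients
        # The exported set: the only functions a client may invoke.
        self._exported = {"put": self._put, "get": self._get}

    def _put(self, key, value):
        self._store[key] = value
        return True

    def _get(self, key):
        return self._store.get(key)

    def _purge(self):
        # Internal helper, deliberately left out of the exported set.
        self._store.clear()

    def call(self, service, *args):
        """Single entry point: anything outside the exported set is refused."""
        if service not in self._exported:
            raise PermissionError(f"service not exported: {service}")
        return self._exported[service](*args)


class Client:
    """A client knows the server only through its exported services."""

    def __init__(self, name, server):
        self.name = name
        self._server = server

    def request(self, service, *args):
        return self._server.call(service, *args)


server = Server()
alice = Client("alice", server)
bob = Client("bob", server)  # one server, several clients

alice.request("put", "orbit", "GEO")
print(bob.request("get", "orbit"))  # GEO
```

A request for a service that is not exported, such as
"purge", is rejected by the server, which is the property the
model relies on.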

2.1.3 Distributed programming techniques 

There are two categories of distributed programming 
techniques: 

--> Inter Process Communication (IPC), allowing two remote
processes to communicate by sending messages to each other.
TCP/IP sockets are a good example of this.

[Figure: Inter Process Communication]

--> Remote Procedure Calls (RPC), allowing two remote
processes to communicate in a different way, namely through
the transmission of parameters. The client/server model can
be implemented quite simply using RPCs. The client calls a
server function using an RPC and then waits for the answer.
The server carries out the processing and returns the results
to the client. One example of the use of RPCs is that of the
SUN RPCs used to build the NFS distributed file system
(Ref. 1).
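
The difference between the two techniques can be sketched as
follows. This is an illustration under stated assumptions,
not the SUN RPC mechanism: a local socket pair stands in for
a TCP/IP connection between two remote processes, JSON with a
4-byte length prefix stands in for a real wire format, and
the function names (send_msg, recv_msg, remote_call,
serve_one) are hypothetical. The first half is plain message
passing; the second half layers a procedure call on top of
it.

```python
import json
import socket
import threading


# --- IPC style: the two sides exchange explicit messages ---------------
def send_msg(sock, obj):
    data = json.dumps(obj).encode()
    sock.sendall(len(data).to_bytes(4, "big") + data)  # length-prefixed frame


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    size = int.from_bytes(recv_exact(sock, 4), "big")
    return json.loads(recv_exact(sock, size).decode())


# --- RPC style: a call with parameters, built on the messages above ----
def remote_call(sock, func, *args):
    """Client side: send the function name and parameters, wait for the answer."""
    send_msg(sock, {"func": func, "args": list(args)})
    return recv_msg(sock)["result"]


def serve_one(sock, functions):
    """Server side: handle one request and return the result to the caller."""
    request = recv_msg(sock)
    result = functions[request["func"]](*request["args"])
    send_msg(sock, {"result": result})


def add(a, b):
    return a + b


client_end, server_end = socket.socketpair()
worker = threading.Thread(target=serve_one, args=(server_end, {"add": add}))
worker.start()
answer = remote_call(client_end, "add", 2, 3)
worker.join()
client_end.close()
server_end.close()
print(answer)  # 5
```

With the IPC functions alone the programmer must design the
messages; with remote_call the exchange looks like an
ordinary function call that blocks until the server answers.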
2.1.4 The Virtual Computer

Another concept highlighted in the state-of-the-art review is
that of the virtual computer. This approach allows the
hardware architecture to be hidden from an application in
order to give it the impression that it is being run on a
centralized system.

[Figure: Virtual Computer (layers shown: Application;
Distributed Operating System; Virtual Computer)]

The virtual computer concept is closely linked with the
concept of transparency. Several transparency levels are
defined in the ANSA project (Ref. 2):

--> access-to-object transparency: an object (such as a file)
may be accessed (i.e. opened, read, deleted etc.) in the same
way whether locally or remotely. An example of a system
offering access-to-file transparency is the NFS (Network File
System). However, a real distributed system must provide this
transparency for all the objects it manages (not only files,
but also peripherals, processes, memory etc.).

--> location transparency: a user or an application need not
worry about the location of the objects he/it is handling.
The NFS also offers this type of transparency: nothing in the
filenames indicates the location of these objects.

--> concurrency transparency: several users or applications
may share a remote object without being aware of it.

--> replication transparency: some objects are replicated
without the application being aware of it. This is very
useful for implementing hardware fault tolerance techniques
by process replication.

--> failure transparency: the occurrence of faults is hidden
from applications, or at least the work in progress is
completed.

--> migration transparency: objects can migrate from one
computer to another without the application being aware of
it.

--> performance management transparency: the system can
reconfigure itself dynamically in order to improve
performance in a transparent manner.

--> scaling transparency: the system or applications can
change the execution scale (e.g. increased number of
computers in a network) without having to change the
algorithms.

The implementation of the different transparency levels can
be used to define a real distributed operating system based
on the concept of a virtual computer. Most Industry or
Research products provide both access and location
transparency. Some provide even more, but at this point in
time, none of the systems investigated is able to provide all
the different kinds of transparency mentioned above.
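
Access and location transparency, the two levels most
products provide, can be sketched as follows. This is a
single-process illustration, not the NFS or ANSA mechanism:
the classes, the path "/tm/frames.log", the host name
"host-b" and the dictionary that simulates a remote disk are
all hypothetical. The point is only that the application
function contains neither a host name nor a local/remote
branch.

```python
class LocalFile:
    """An object held on the computer running the application."""

    def __init__(self, content):
        self._content = content

    def read(self):
        return self._content


class RemoteFile:
    """Stand-in for an object held elsewhere; the dict simulates remote disks."""

    def __init__(self, host, remote_disks):
        self._host = host
        self._remote_disks = remote_disks

    def read(self):
        # A real system would perform a network transfer here.
        return self._remote_disks[self._host]


class NameService:
    """Maps location-free names to objects, wherever they are."""

    def __init__(self):
        self._objects = {}

    def bind(self, name, obj):
        self._objects[name] = obj

    def open(self, name):
        return self._objects[name]


def read_log(names, path):
    """Application code: same call sequence for local and remote objects."""
    return names.open(path).read()


names = NameService()

# First configuration: the file is local.
names.bind("/tm/frames.log", LocalFile("frame-0001"))
local_result = read_log(names, "/tm/frames.log")

# Second configuration: the file has moved to another computer.
remote_disks = {"host-b": "frame-0001"}
names.bind("/tm/frames.log", RemoteFile("host-b", remote_disks))
remote_result = read_log(names, "/tm/frames.log")

print(local_result == remote_result)  # True
```

The name carries no location (location transparency) and the
read operation is identical in both configurations (access
transparency); the other levels listed above would require
additional machinery that this sketch does not attempt.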

2.1.5 Process groups

This concept can be found in many of the systems
investigated. A process group, as its name indicates, groups
together several different processes. Its advantage is that
all the processes belonging to the same group receive the
same messages.

With this concept, message broadcast and above all fault
tolerance can easily be implemented by replicating the same
processes on different sites.

2.2 Systems investigated

Having examined both Research and Industry products, three
types of system could be identified:

--> native systems integrating distribution at kernel level.
The operating system is built over the kernel. The most
technologically advanced kernels at present are Chorus
(Ref. 3), Mach (Ref. 4) and Amoeba (Ref. 5).

[Figure: native systems (layers shown: application; system;
kernel)]

--> integrated systems implementing functions related to
distribution.

--> platforms or toolboxes located over the operating system.
They provide users with a distributed environment without
masking the operating system. The most advanced are ISIS
(Ref. 6), ANSA (Ref. 2), DELTA-4 (Ref. 7) and OSF/DCE
(Ref. 8).

[Figure: Platforms]

All current state-of-the-art systems provide at least access
and location transparency. Some of them offer replication
transparency (like ISIS), and others failure transparency
(like DELTA-4). All these systems are oriented towards use
with Unix.

With respect to standardization, the OSF/DCE system would
appear to be the most promising as it is supported by the
majority of Unix manufacturers. Unfortunately, OSF/DCE is not
yet available and will definitely not be available before
1993.

3. THE CONTRIBUTION OF THE CONCEPTS

3.1 The virtual computer

3.1.1 Adapting the architecture to the application 

When building a spacecraft control center, the manu- 
facturer and hardware architecture are both selected at 
the outset of the project, before software development 
begins. Sometimes, however, software development 
can last several years (1 to 5). 

Before the application has been validated, one never knows
whether the hardware architecture will be powerful enough to
run it. If it is not, either the current configuration has to
be upgraded (extra memory, CPU board, additional disks etc.)
or the software code has to be optimized. If the power of the
CPU board cannot be upgraded, one or more extra computers
have to be added, which requires a change in the architecture
and therefore a change in the application code.

Using a distributed system implementing the concept of a
virtual computer enables the distribution of the architecture
to be hidden from the application. Thus, at integration time
it is quite possible to redistribute the application
components over another distributed architecture (for example
by adding a computer) so as to meet the performance
requirements, without having to change the application code.
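
The idea can be sketched as follows, under simplifying
assumptions: everything runs in one process, the "virtual
computer" is a hypothetical class that only records which
host each component would be placed on, and the three
components and the round-robin placement rule are invented
for the illustration. What the sketch shows is that the
application function is identical whether one or three
computers are configured.

```python
class VirtualComputer:
    """Hides the set of hosts; the application only names components."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.components = {}
        self.placement = {}  # component name -> host chosen by the system

    def deploy(self, name, func):
        # Hypothetical placement rule: round-robin over the available hosts.
        host = self.hosts[len(self.components) % len(self.hosts)]
        self.components[name] = func
        self.placement[name] = host

    def call(self, name, *args):
        # A real system would forward the call to self.placement[name].
        return self.components[name](*args)


COMPONENTS = {
    "acquire": lambda: [3, 1, 2],
    "sort": sorted,
    "report": lambda values: ",".join(str(v) for v in values),
}


def build(hosts):
    """Integration step: the only place where the architecture appears."""
    vc = VirtualComputer(hosts)
    for name, func in COMPONENTS.items():
        vc.deploy(name, func)
    return vc


def run_application(vc):
    """Application code: no host name, no knowledge of the distribution."""
    return vc.call("report", vc.call("sort", vc.call("acquire")))


single = build(["host-a"])
triple = build(["host-a", "host-b", "host-c"])  # computers added later

print(run_application(single))  # 1,2,3
print(run_application(triple))  # 1,2,3
print(triple.placement)
# {'acquire': 'host-a', 'sort': 'host-b', 'report': 'host-c'}
```

Only the argument passed to build changes between the two
configurations; run_application is untouched.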

3.1.2 Keeping one step ahead of architecture obsoles- 
cence 

Owing to the current developments in data processing 
technologies, the price/performance ratio of comput- 
ers is constantly decreasing such that the architecture 
selected at the beginning of the project is technically 
superseded by the end, and the resulting price/perfor- 
mance ratio is very poor. The project investment cost 
may appear to be relatively high compared with the 
architecture's real value at the time of validation. Fur- 
thermore, there may be a better architecture/manufac- 
turer pair at the time of the application's validation. 

One solution consists in choosing the target architecture
after the application has been developed. This first requires
the use of a standard (Unix) operating system in order to be
independent of the manufacturer. If this system implements
the virtual computer concept, the application also becomes
independent of the architecture. However, this raises
problems. Firstly, if no architecture is chosen, on which
computers will the application be developed? This problem can
be solved by buying low-end workstations which will be used
exclusively for development work. Secondly, is it really
possible to disregard the specific characteristics of a
project (communication protocols, fault tolerance etc.),
which often determine the choice of architecture at the
outset of the project?

3.2 The client/server model 

3.2.1 Simplifying the development of distributed ap- 
plications 

The design and development of a distributed application is
quite complex: using tools such as TCP/IP sockets is not
easy, and debugging a distributed application may be
distinctly difficult (errors that are hard to reproduce, lack
of debugging tools etc.).

The development of distributed applications can be simplified
using the RPC-based client/server model. RPC products provide
interface description languages and generators which allow
the developer to concentrate exclusively on the development
of server and client functions without worrying about network
communication. The server interfaces are clearly defined and,
as a result, debugging becomes easier.
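
The role of an interface description and a generator can be
sketched as follows. This is not the syntax or the output of
any real RPC product: the interface is written here as a
plain Python dictionary, the "generator" builds the client
stub and the server dispatcher at run time, and the transport
is a direct in-process call standing in for the network. The
interface contents (get_parameter, set_limit) and the
telemetry values are hypothetical.

```python
# Hypothetical interface description: function name -> parameter names.
INTERFACE = {
    "get_parameter": ["name"],
    "set_limit": ["name", "low", "high"],
}


def make_client_stub(interface, transport):
    """Generate a client-side object with one method per declared function."""

    class Stub:
        pass

    stub = Stub()
    for func, params in interface.items():
        def method(*args, _func=func, _params=params):
            if len(args) != len(_params):
                raise TypeError(f"{_func} expects parameters {_params}")
            return transport(_func, args)  # all communication is hidden here

        setattr(stub, func, method)
    return stub


def make_dispatcher(interface, implementation):
    """Generate the server-side routine that routes a request to its function."""

    def dispatch(func, args):
        if func not in interface:
            raise LookupError(f"function not in interface: {func}")
        return getattr(implementation, func)(*args)

    return dispatch


class TelemetryServer:
    """Server functions written by the developer; no communication code."""

    def __init__(self):
        self.values = {"battery_voltage": 28.1}
        self.limits = {}

    def get_parameter(self, name):
        return self.values[name]

    def set_limit(self, name, low, high):
        self.limits[name] = (low, high)
        return True


server = TelemetryServer()
# In-process transport; a real product would send the request over the network.
stub = make_client_stub(INTERFACE, make_dispatcher(INTERFACE, server))

print(stub.get_parameter("battery_voltage"))  # 28.1
stub.set_limit("battery_voltage", 24.0, 32.0)
print(server.limits)  # {'battery_voltage': (24.0, 32.0)}
```

Because the interface is declared in one place, a call with
the wrong number of parameters is refused by the stub before
anything is transmitted, which is one reason clearly defined
interfaces ease debugging.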

3.3 Process groups 

3.3.1 Tolerating hardware faults 

Hardware fault tolerance acts as a brake upon distri- 
bution. It has a direct effect on both the architecture’s 
design (redundant computers) and development 
(reconfiguration scenarios, failure processing etc.). 

In some applications the problem is solved by:

--> using fault-tolerant computers (Tandem, Stratus etc.)

--> replicating computers so as to be able to reinitiate the
application on the redundant computers.

The solution based on fault-tolerant computers may be deemed
expensive. Computer replication may be inappropriate as it
requires that the application should be stopped and then
reinitiated on another machine with the same hardware
configuration.

Fault tolerance problems can be solved efficiently, and
without having to buy specific computers, with a distributed
system implementing process groups on replicated computers.
Indeed, as the processes are replicated on distinct
computers, the failure of a single computer does not affect
the application's operation.
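
The mechanism can be sketched as follows. This is a
single-process illustration and not the interface of ISIS or
of any other toolbox: the class names, host names and
messages are hypothetical, a failure is simulated by a flag,
and the sketch ignores the ordering and membership agreement
problems that a real process group implementation has to
solve.

```python
class Replica:
    """One copy of the process, running on its own computer."""

    def __init__(self, host):
        self.host = host
        self.alive = True
        self.log = []

    def deliver(self, message):
        if not self.alive:
            raise ConnectionError(f"{self.host} is down")
        self.log.append(message)
        return len(self.log)  # reply: number of messages processed so far


class ProcessGroup:
    """Every member of the group receives every message sent to the group."""

    def __init__(self, members):
        self.members = list(members)

    def broadcast(self, message):
        replies = []
        for member in list(self.members):
            try:
                replies.append(member.deliver(message))
            except ConnectionError:
                self.members.remove(member)  # failed replica leaves the group
        if not replies:
            raise RuntimeError("no replica left in the group")
        return replies[0]  # replicas are identical, so any reply will do


replicas = [Replica("host-a"), Replica("host-b"), Replica("host-c")]
group = ProcessGroup(replicas)

group.broadcast("TM frame 1")
replicas[1].alive = False             # simulated failure of one computer
reply = group.broadcast("TM frame 2")  # the application is not interrupted

print(reply)                              # 2
print([m.host for m in group.members])    # ['host-a', 'host-c']
print(replicas[0].log == replicas[2].log)  # True
```

The caller of broadcast never sees the failure: the surviving
replicas hold the same state and keep answering, so the
application is neither stopped nor restarted.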

4. THE PROTOTYPE 

4.1 Objectives 

The objective of the prototype's development was the 
practical validation of the aforementioned concepts, 
namely the concept of a virtual computer, the client/ 
server model and process groups. 

Thus, the prototype was designed:

--> to be independent of hardware architecture. It is hoped
that our application will be able to run on one, two or "x"
computers without having to resort to recompilation. It was
therefore decided to develop and validate our distributed
application on a single computer before testing it on a
distributed architecture.

--> to be independent of the manufacturer by using Unix
standards (portability).

--> to tolerate computer hardware failures without affecting
the application's operation.

4.2 Functional characteristics 

It was decided to put the previous concepts into prac- 
tice in an application typical of the space environ- 
ment, and building a prototype inspired by spacecraft 
control centers appeared a judicious idea. 

The following functions were chosen: 

--> Telemetry acquisition and decommutation,

--> Real-time telemetry monitoring,

--> Control and monitoring of the distributed application,

--> Real-time logbook,

--> Off-line analysis of the logbook,

--> Off-line telemetry processing.

4.3 Development environment 

The development environment of the prototype con- 
sists of: 

--> the ISIS toolbox developed by Cornell University, to
manage fault tolerance by using the process group concept.

--> the EASY RPC product developed by the French company
"Cap Gemini Innovation". This product, based on Sun RPCs,
also provides both access and location transparency; it
therefore enables an application to be developed
independently of the architecture. The EASY RPC product
enables easy implementation of the client/server model, thus
facilitating the programming and debugging of the distributed
application.

--> a Unix system complying with POSIX.1 and XPG3 standards,
to be independent of the manufacturer. Whilst our application
is intended to be developed on a SUN4 workstation, the target
architecture will actually comprise DEC, HP and SUN Unix
computers.

--> the C++ language for modular software development.

--> the OSF/Motif system for multiwindowing combined with a
graphic interface generator.

5. CONCLUSION 

This research project is not yet completed, since the 
prototype is still being developed. Nevertheless, posi- 
tive results are expected and it is hoped that future 
developments in spacecraft control centers will inte- 
grate distribution concepts so as to be able to develop 
space applications which are flexible, upgradeable, 
hardware fault tolerant and above all independent of 
hardware architecture. 